This function performs a block read from an instance of the ADO.NET object System.Data.DataTable. This object may only be read using an explicit row-wise loop, which is slow at the APL level. 2011⌶ implements an internal row-wise loop which is much faster on large arrays. Furthermore, the function handles NULL values and the conversion of .NET datatypes to the appropriate internal APL form in a more efficient manner than can otherwise be achieved. These 3 factors together mean that the function provides a significant improvement in performance compared to calling the row-wise programming interface directly at the APL level.
Y is a scalar or a 2-item array containing:
The optional argument X is Boolean vector, where a 1 indicates that the corresponding column of the result should be converted to a string using the ToString
method of the data type of column in question. It is envisaged that this argument may be extended in the future, to allow other conversions – for example converting Dates to a floating-point format
The result R is a matrix with the same shape as the DataTable identified in Y.
First for comparison is shown the type of code that is required to read a DataTable by looping:
t←3⊃⎕AI ⋄ data1←↑(⌷dt.Rows).ItemArray ⋄ (3⊃⎕AI)-t 191
The above expression turns the dt.Rows collection into an array using ⌷, and mixes the ItemArray properties to produce the result. Although here there is no explicit loop, involved, there is an implicit loop required to reference each item of the collection in succession. This operation performs at about 200 rows/sec.
2011⌶ does the looping entirely in compiled code and is significantly faster:
GetDT←2011⌶ t←3⊃⎕AI ⋄ data2←GetDT dt ⋄ (3⊃⎕AI)-t 25
In the first example shown above, 2011⌶ created 365 instances of System.DateTime objects in the workspace. If we are willing to receive the timestamps in the form of strings, we can read the data almost an order of magnitude faster:
t←3⊃⎕AI ⋄ data3←0 0 0 1 GetDT dt ⋄ (3⊃⎕AI)-t 3
The left argument to 2011⌶ allows you to flag columns which should be returned as the ToString()
value of each object in the flagged columns. Although the resulting array looks identical to the original, it is not: The fourth column contains character vectors:
¯2 4↑data3 364 even 4 18-01-2011 14:03:29 365 odd 5 19-01-2011 14:03:29
Depending on your application, you may need to process the text in the fourth column in some way – but the overall performance will probably still be very much better than it would be if DateTime objects were used.
Using the DataTable produced by the corresponding example shown for 2010⌶ it can be shown that by default null values will be read back into the APL workspace as instances of System.DBNull
.
GetDT←2011⌶> ⎕←z←GetDT dt odd 1 21-01-2010 14:50:19 two 2 22-01-2010 14:50:19 three odd 23-01-2010 14:50:19 (1 1⍉z).GetType System.DBNull System.DBNull System.DBNull
However, by supplying a left argument to 2011⌶, we can request that nulls in each column are mapped to a corresponding value of our choice; in this case, '<null>', 'even', and 99 respectively.
GetDT dt ('<null>' 'even' 99 '') <null> odd 1 21-01-2010 14:50:19 two even 2 22-01-2010 14:50:19 three odd 99 23-01-2010 14:50:19